Overview

Dataset statistics

Number of variables: 9
Number of observations: 4385
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 342.6 KiB
Average record size in memory: 80.0 B
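
The memory figures above can be reproduced with simple arithmetic. The sketch below rests on assumptions that the report does not state: every column is stored as an 8-byte float, and the row index (a timestamp, judging by the Sample section) also takes 8 bytes per row.

```python
# Assumptions (not stated in the report): every column is float64 (8 bytes
# per value) and the row index also takes 8 bytes per row.
N_ROWS = 4385
N_COLUMNS = 9
BYTES_PER_VALUE = 8

column_bytes = N_ROWS * N_COLUMNS * BYTES_PER_VALUE
index_bytes = N_ROWS * BYTES_PER_VALUE
total_bytes = column_bytes + index_bytes

total_kib = total_bytes / 1024           # KiB = 1024 bytes
avg_record_bytes = total_bytes / N_ROWS  # bytes per observation

print(f"Total size in memory: {total_kib:.1f} KiB")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under the same assumptions, the per-variable memory size of 68.5 KiB reported in the Variables section matches one column plus the index: 2 × 4385 × 8 B = 70,160 B.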

Variable types

Numeric: 9

Alerts

Dataset has 2 (< 0.1%) duplicate rows [Duplicates]
FREQUENCIA BOMBA 2 is highly overall correlated with VAZÃO DE RECALQUE - FT03 and 1 other field [High correlation]
NIVEL DO RESERVATÓRIO - LT01 is highly overall correlated with VAZÃO DE RECALQUE - FT03 and 1 other field [High correlation]
VAZÃO DE RECALQUE - FT03 is highly overall correlated with FREQUENCIA BOMBA 1 and 5 other fields [High correlation]
PRESSÃO DE SUCÇÃO - PT01 is highly overall correlated with NIVEL DO RESERVATÓRIO - LT01 and 3 other fields [High correlation]
PRESSÃO DE RECALQUE - PT02 is highly overall correlated with FREQUENCIA BOMBA 1 and 4 other fields [High correlation]
FREQUENCIA BOMBA 1 is highly overall correlated with VAZÃO DE GRAVIDADE - FT02 and 2 other fields [High correlation]
VAZÃO DE GRAVIDADE - FT02 is highly overall correlated with FREQUENCIA BOMBA 1 and 3 other fields [High correlation]
FREQUENCIA BOMBA 1 has 392 (8.9%) zeros [Zeros]
FREQUENCIA BOMBA 2 has 1136 (25.9%) zeros [Zeros]
FREQUENCIA BOMBA 3 has 3687 (84.1%) zeros [Zeros]
VAZÃO DE ENTRADA- FT01 has 795 (18.1%) zeros [Zeros]
VAZÃO DE GRAVIDADE - FT02 has 272 (6.2%) zeros [Zeros]
PRESSÃO DE RECALQUE - PT02 has 79 (1.8%) zeros [Zeros]

Reproduction

Analysis started: 2022-12-15 14:44:36.332236
Analysis finished: 2022-12-15 14:44:57.748445
Duration: 21.42 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json
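
As a quick consistency check, the reported duration follows from the two timestamps above. This is a minimal sketch using only the values shown in this section.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2022-12-15 14:44:36.332236")
finished = datetime.fromisoformat("2022-12-15 14:44:57.748445")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```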

Variables

FREQUENCIA BOMBA 1
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 1196
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.939888
Minimum: 0
Maximum: 59.988281
Zeros: 392
Zeros (%): 8.9%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 57.842842
Median: 57.988792
Q3: 57.988792
95th percentile: 58.317091
Maximum: 59.988281
Range: 59.988281
Interquartile range (IQR): 0.14595032

Descriptive statistics

Standard deviation: 17.142205
Coefficient of variation (CV): 0.33003932
Kurtosis: 5.1385932
Mean: 51.939888
Median Absolute Deviation (MAD): 0
Skewness: -2.6473655
Sum: 227756.41
Variance: 293.8552
Monotonicity: Not monotonic
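
Several of the statistics above are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, IQR is Q3 minus Q1, and range is maximum minus minimum. The sketch below recomputes them from the rounded values reported for FREQUENCIA BOMBA 1, so small differences in the last digits are expected.

```python
# Values as reported for FREQUENCIA BOMBA 1 (already rounded by the report).
mean = 51.939888
std = 17.142205
q1, q3 = 57.842842, 57.988792
minimum, maximum = 0.0, 59.988281

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV       = {cv:.8f}")
print(f"Variance = {variance:.4f}")
print(f"IQR      = {iqr:.8f}")
print(f"Range    = {value_range:.6f}")
```
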
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
57.98879242 | 2632 | 60.0%
0 | 392 | 8.9%
59.98828125 | 45 | 1.0%
58.01076508 | 13 | 0.3%
49.99084473 | 13 | 0.3%
58.05471039 | 12 | 0.3%
44.99212646 | 11 | 0.3%
29.99230957 | 10 | 0.2%
58.12062836 | 9 | 0.2%
34.99102783 | 9 | 0.2%
Other values (1186) | 1239 | 28.3%

Minimum 10 values
Value | Count | Frequency (%)
0 | 392 | 8.9%
0.01275072433 | 1 | < 0.1%
0.01913495362 | 1 | < 0.1%
0.02551918104 | 1 | < 0.1%
0.03190341219 | 1 | < 0.1%
0.04066173732 | 1 | < 0.1%
0.0472676903 | 1 | < 0.1%
0.06109762564 | 1 | < 0.1%
0.07513397187 | 1 | < 0.1%
0.08153351396 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
59.98828125 | 45 | 1.0%
59.98095703 | 2 | < 0.1%
59.9793396 | 1 | < 0.1%
59.97729492 | 2 | < 0.1%
59.97363281 | 1 | < 0.1%
59.94200516 | 1 | < 0.1%
59.94067383 | 1 | < 0.1%
59.92602539 | 1 | < 0.1%
59.92236328 | 2 | < 0.1%
59.89031601 | 1 | < 0.1%

FREQUENCIA BOMBA 2
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 3190
Distinct (%): 72.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.81337
Minimum: 0
Maximum: 59.991943
Zeros: 1136
Zeros (%): 25.9%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 34.908695
Q3: 38.02359
95th percentile: 49.226396
Maximum: 59.991943
Range: 59.991943
Interquartile range (IQR): 38.02359

Descriptive statistics

Standard deviation: 17.608565
Coefficient of variation (CV): 0.63309714
Kurtosis: -0.92343398
Mean: 27.81337
Median Absolute Deviation (MAD): 4.5594673
Skewness: -0.71795914
Sum: 121961.63
Variance: 310.06157
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 1136 | 25.9%
29.99597168 | 12 | 0.3%
39.99707031 | 10 | 0.2%
57.99245453 | 8 | 0.2%
59.99194336 | 7 | 0.2%
44.99578857 | 6 | 0.1%
34.99468994 | 6 | 0.1%
24.99725342 | 3 | 0.1%
36.11528015 | 3 | 0.1%
49.99450684 | 3 | 0.1%
Other values (3180) | 3191 | 72.8%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1136 | 25.9%
0.0004281122528 | 1 | < 0.1%
0.0005553666269 | 1 | < 0.1%
0.001183174783 | 1 | < 0.1%
0.003486369271 | 1 | < 0.1%
0.003600863041 | 1 | < 0.1%
0.008166855201 | 1 | < 0.1%
0.01569584385 | 1 | < 0.1%
0.02166323923 | 1 | < 0.1%
0.02376453392 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
59.99194336 | 7 | 0.2%
59.98828125 | 3 | 0.1%
59.98324203 | 1 | < 0.1%
59.98095703 | 1 | < 0.1%
59.97058105 | 1 | < 0.1%
59.96425629 | 1 | < 0.1%
59.96066284 | 1 | < 0.1%
59.95888519 | 1 | < 0.1%
59.95774841 | 1 | < 0.1%
59.95323944 | 1 | < 0.1%

FREQUENCIA BOMBA 3
Real number (ℝ)

Distinct: 638
Distinct (%): 14.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.4058755
Minimum: 0
Maximum: 59.988281
Zeros: 3687
Zeros (%): 84.1%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 50.017325
Maximum: 59.988281
Range: 59.988281
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 16.765124
Coefficient of variation (CV): 2.6171479
Kurtosis: 3.2320295
Mean: 6.4058755
Median Absolute Deviation (MAD): 0
Skewness: 2.2674761
Sum: 28089.764
Variance: 281.06937
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 3687 | 84.1%
57.98879242 | 36 | 0.8%
0.1318343282 | 10 | 0.2%
59.98828125 | 7 | 0.2%
39.9934082 | 5 | 0.1%
54.98956299 | 4 | 0.1%
29.99230957 | 3 | 0.1%
0.1245101988 | 2 | < 0.1%
34.99102783 | 2 | < 0.1%
0.1272614598 | 1 | < 0.1%
Other values (628) | 628 | 14.3%

Minimum 10 values
Value | Count | Frequency (%)
0 | 3687 | 84.1%
0.0001107446224 | 1 | < 0.1%
0.001534604002 | 1 | < 0.1%
0.005035049282 | 1 | < 0.1%
0.006258170586 | 1 | < 0.1%
0.007479577791 | 1 | < 0.1%
0.009604785591 | 1 | < 0.1%
0.0099241063 | 1 | < 0.1%
0.01236863434 | 1 | < 0.1%
0.01295140106 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
59.98828125 | 7 | 0.2%
59.96926498 | 1 | < 0.1%
59.95925522 | 1 | < 0.1%
59.94100571 | 1 | < 0.1%
59.87623978 | 1 | < 0.1%
59.86713791 | 1 | < 0.1%
59.86031342 | 1 | < 0.1%
59.85322952 | 1 | < 0.1%
59.85174561 | 1 | < 0.1%
59.83979797 | 1 | < 0.1%

NIVEL DO RESERVATÓRIO - LT01
Real number (ℝ)

Distinct: 4372
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2355002
Minimum: 0.29407585
Maximum: 4.4049139
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0.29407585
5th percentile: 1.9517602
Q1: 2.7917631
Median: 3.30353
Q3: 3.7749107
95th percentile: 4.2591256
Maximum: 4.4049139
Range: 4.1108381
Interquartile range (IQR): 0.98314762

Descriptive statistics

Standard deviation: 0.69706843
Coefficient of variation (CV): 0.21544379
Kurtosis: -0.032666725
Mean: 3.2355002
Median Absolute Deviation (MAD): 0.4901371
Skewness: -0.55811943
Sum: 14187.668
Variance: 0.48590439
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
3 | 4 | 0.1%
4.300000191 | 4 | 0.1%
3.920138836 | 2 | < 0.1%
4.047839642 | 2 | < 0.1%
3.236786366 | 2 | < 0.1%
4.295138836 | 2 | < 0.1%
3.432870388 | 2 | < 0.1%
3.836458206 | 2 | < 0.1%
3.001000643 | 2 | < 0.1%
3.768044233 | 1 | < 0.1%
Other values (4362) | 4362 | 99.5%

Minimum 10 values
Value | Count | Frequency (%)
0.2940758467 | 1 | < 0.1%
0.3723406196 | 1 | < 0.1%
0.4662296474 | 1 | < 0.1%
0.4815826118 | 1 | < 0.1%
0.8559085727 | 1 | < 0.1%
0.8737350106 | 1 | < 0.1%
0.8987794518 | 1 | < 0.1%
0.9570472836 | 1 | < 0.1%
0.9613910913 | 1 | < 0.1%
0.9710686803 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
4.404913902 | 1 | < 0.1%
4.403257847 | 1 | < 0.1%
4.401571751 | 1 | < 0.1%
4.401538372 | 1 | < 0.1%
4.401222229 | 1 | < 0.1%
4.400625229 | 1 | < 0.1%
4.398981094 | 1 | < 0.1%
4.398623466 | 1 | < 0.1%
4.397883892 | 1 | < 0.1%
4.39741993 | 1 | < 0.1%

VAZÃO DE ENTRADA- FT01
Real number (ℝ)

Distinct: 1922
Distinct (%): 43.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.66294
Minimum: 0
Maximum: 383.87036
Zeros: 795
Zeros (%): 18.1%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0.11574074
Median: 0.11574074
Q3: 264.27155
95th percentile: 280.08785
Maximum: 383.87036
Range: 383.87036
Interquartile range (IQR): 264.1558

Descriptive statistics

Standard deviation: 132.60141
Coefficient of variation (CV): 1.1769746
Kurtosis: -1.8552865
Mean: 112.66294
Median Absolute Deviation (MAD): 0.11574074
Skewness: 0.34275668
Sum: 494026.98
Variance: 17583.135
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.1157407388 | 1652 | 37.7%
0 | 795 | 18.1%
0.2314814776 | 4 | 0.1%
1.273148179 | 3 | 0.1%
276.5046387 | 2 | < 0.1%
261.458313 | 2 | < 0.1%
264.6990662 | 2 | < 0.1%
276.2731628 | 2 | < 0.1%
263.541687 | 2 | < 0.1%
0.3472222388 | 2 | < 0.1%
Other values (1912) | 1919 | 43.8%

Minimum 10 values
Value | Count | Frequency (%)
0 | 795 | 18.1%
0.04144435376 | 1 | < 0.1%
0.0578703694 | 1 | < 0.1%
0.1157407388 | 1652 | 37.7%
0.1275791973 | 1 | < 0.1%
0.1383209676 | 1 | < 0.1%
0.1736586094 | 1 | < 0.1%
0.176884532 | 1 | < 0.1%
0.1776106358 | 1 | < 0.1%
0.1835666746 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
383.8703613 | 1 | < 0.1%
381.5904236 | 1 | < 0.1%
376.8986206 | 1 | < 0.1%
374.4212952 | 1 | < 0.1%
370.3518372 | 1 | < 0.1%
367.4073792 | 1 | < 0.1%
366.7018738 | 1 | < 0.1%
366.4682617 | 1 | < 0.1%
365.7449341 | 1 | < 0.1%
364.6219177 | 1 | < 0.1%

VAZÃO DE GRAVIDADE - FT02
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 4114
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 132.94359
Minimum: 0
Maximum: 326.1713
Zeros: 272
Zeros (%): 6.2%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 123.97898
Median: 136.00012
Q3: 148.20116
95th percentile: 179.92901
Maximum: 326.1713
Range: 326.1713
Interquartile range (IQR): 24.222176

Descriptive statistics

Standard deviation: 44.78165
Coefficient of variation (CV): 0.336847
Kurtosis: 4.9098637
Mean: 132.94359
Median Absolute Deviation (MAD): 12.126892
Skewness: -0.67329216
Sum: 582957.65
Variance: 2005.3962
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 272 | 6.2%
108.8570862 | 1 | < 0.1%
141.1200867 | 1 | < 0.1%
140.4069519 | 1 | < 0.1%
128.144989 | 1 | < 0.1%
125.5947037 | 1 | < 0.1%
153.2140656 | 1 | < 0.1%
143.0213776 | 1 | < 0.1%
144.0009308 | 1 | < 0.1%
146.4695892 | 1 | < 0.1%
Other values (4104) | 4104 | 93.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 272 | 6.2%
27.51053429 | 1 | < 0.1%
30.12467003 | 1 | < 0.1%
30.16630745 | 1 | < 0.1%
30.91310501 | 1 | < 0.1%
31.05513191 | 1 | < 0.1%
41.97084808 | 1 | < 0.1%
56.65369415 | 1 | < 0.1%
56.89728165 | 1 | < 0.1%
57.67729187 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
326.1712952 | 1 | < 0.1%
324.9286499 | 1 | < 0.1%
322.9801636 | 1 | < 0.1%
320.3776245 | 1 | < 0.1%
304.5761719 | 1 | < 0.1%
302.353302 | 1 | < 0.1%
302.0870361 | 1 | < 0.1%
301.3987427 | 1 | < 0.1%
300.1445923 | 1 | < 0.1%
299.913208 | 1 | < 0.1%

VAZÃO DE RECALQUE - FT03
Real number (ℝ)

Distinct: 4114
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.40697
Minimum: 0
Maximum: 194.35185
Zeros: 24
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0.028935185
Q1: 111.65463
Median: 118.82233
Q3: 125.62949
95th percentile: 136.53865
Maximum: 194.35185
Range: 194.35185
Interquartile range (IQR): 13.974861

Descriptive statistics

Standard deviation: 31.328318
Coefficient of variation (CV): 0.27870442
Kurtosis: 7.2635604
Mean: 112.40697
Median Absolute Deviation (MAD): 6.9732361
Skewness: -2.6536729
Sum: 492904.54
Variance: 981.46349
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.0289351847 | 226 | 5.2%
0 | 24 | 0.5%
117.8530121 | 3 | 0.1%
119.1956024 | 2 | < 0.1%
132.5810242 | 2 | < 0.1%
113.6284714 | 2 | < 0.1%
114.9884186 | 2 | < 0.1%
110.1273193 | 2 | < 0.1%
132.378479 | 2 | < 0.1%
117.2742996 | 2 | < 0.1%
Other values (4104) | 4118 | 93.9%

Minimum 10 values
Value | Count | Frequency (%)
0 | 24 | 0.5%
0.01446759235 | 1 | < 0.1%
0.0289351847 | 226 | 5.2%
0.2875666618 | 1 | < 0.1%
0.3858614862 | 1 | < 0.1%
0.4217657745 | 1 | < 0.1%
0.5559648871 | 1 | < 0.1%
0.6147419214 | 1 | < 0.1%
0.69016397 | 1 | < 0.1%
0.8436223269 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
194.3518524 | 1 | < 0.1%
189.3903809 | 1 | < 0.1%
188.7152863 | 1 | < 0.1%
185.4285889 | 1 | < 0.1%
183.7471924 | 1 | < 0.1%
183.2344971 | 1 | < 0.1%
181.8721008 | 1 | < 0.1%
180.8506927 | 1 | < 0.1%
179.4629211 | 1 | < 0.1%
177.5573273 | 1 | < 0.1%

PRESSÃO DE SUCÇÃO - PT01
Real number (ℝ)

Distinct: 4364
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1077789
Minimum: 0.87751222
Maximum: 5.6827645
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0.87751222
5th percentile: 2.7513295
Q1: 3.6190748
Median: 4.147234
Q3: 4.6634259
95th percentile: 5.2179035
Maximum: 5.6827645
Range: 4.8052523
Interquartile range (IQR): 1.0443511

Descriptive statistics

Standard deviation: 0.76275458
Coefficient of variation (CV): 0.1856854
Kurtosis: 0.24480127
Mean: 4.1077789
Median Absolute Deviation (MAD): 0.52290487
Skewness: -0.41059959
Sum: 18012.61
Variance: 0.58179454
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
5.541088104 | 7 | 0.2%
5.532407284 | 6 | 0.1%
5.636573792 | 4 | 0.1%
5.55555582 | 3 | 0.1%
5.53819418 | 2 | < 0.1%
4.202215195 | 2 | < 0.1%
4.284318924 | 2 | < 0.1%
5.182291985 | 2 | < 0.1%
4.845046043 | 2 | < 0.1%
5.092852592 | 1 | < 0.1%
Other values (4354) | 4354 | 99.3%

Minimum 10 values
Value | Count | Frequency (%)
0.8775122166 | 1 | < 0.1%
0.8825973868 | 1 | < 0.1%
0.8876825571 | 1 | < 0.1%
0.8906169534 | 1 | < 0.1%
0.892767787 | 1 | < 0.1%
0.8949127793 | 1 | < 0.1%
0.8992086649 | 1 | < 0.1%
0.9035044909 | 1 | < 0.1%
1.519142866 | 1 | < 0.1%
1.598580122 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
5.68276453 | 1 | < 0.1%
5.668966293 | 1 | < 0.1%
5.668751717 | 1 | < 0.1%
5.666208267 | 1 | < 0.1%
5.665445328 | 1 | < 0.1%
5.663450718 | 1 | < 0.1%
5.661117077 | 1 | < 0.1%
5.660072803 | 1 | < 0.1%
5.656788349 | 1 | < 0.1%
5.652602673 | 1 | < 0.1%

PRESSÃO DE RECALQUE - PT02
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 1992
Distinct (%): 45.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.794268
Minimum: 0
Maximum: 28.084936
Zeros: 79
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 68.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0.082886364
Q1: 21.721619
Median: 22.048611
Q3: 23.017941
95th percentile: 25.954861
Maximum: 28.084936
Range: 28.084936
Interquartile range (IQR): 1.2963219

Descriptive statistics

Standard deviation: 6.1425533
Coefficient of variation (CV): 0.29539647
Kurtosis: 6.1429142
Mean: 20.794268
Median Absolute Deviation (MAD): 0.96932983
Skewness: -2.6683162
Sum: 91182.864
Variance: 37.730961
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
22.00520706 | 216 | 4.9%
23.00347328 | 183 | 4.2%
22.9745369 | 155 | 3.5%
23.01794052 | 145 | 3.3%
22.01967621 | 145 | 3.3%
21.97627258 | 140 | 3.2%
22.98900414 | 103 | 2.3%
22.04861069 | 102 | 2.3%
23.046875 | 87 | 2.0%
0 | 79 | 1.8%
Other values (1982) | 3030 | 69.1%

Minimum 10 values
Value | Count | Frequency (%)
0 | 79 | 1.8%
0.002456213813 | 1 | < 0.1%
0.01011194568 | 1 | < 0.1%
0.01062385458 | 1 | < 0.1%
0.01230416168 | 1 | < 0.1%
0.01609536819 | 1 | < 0.1%
0.01681616344 | 1 | < 0.1%
0.0184647534 | 1 | < 0.1%
0.01964788325 | 1 | < 0.1%
0.02034074813 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
28.08493614 | 1 | < 0.1%
28.05792236 | 1 | < 0.1%
28.04745102 | 1 | < 0.1%
28.04636955 | 1 | < 0.1%
28.03819466 | 2 | < 0.1%
28.02372551 | 5 | 0.1%
28.01240349 | 1 | < 0.1%
28.00926018 | 5 | 0.1%
28.00657845 | 1 | < 0.1%
28.00614929 | 1 | < 0.1%

Interactions

[81 pairwise interaction plots, one per ordered pair of the 9 variables; images not reproduced here]

Correlations


Auto

The auto setting picks an interpretable pairwise metric for each pair of columns according to the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
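
The mapping above can be read as a small lookup on the two column types. The sketch below is only an illustration: `auto_metric` is a hypothetical helper, not a pandas-profiling function, and the type labels are the ones used in the list above.

```python
# Hypothetical helper (not part of pandas-profiling): returns the method and
# value range that the mapping above assigns to a pair of column types.
def auto_metric(type_a, type_b):
    kinds = {type_a, type_b}
    if not kinds <= {"Numerical", "Categorical"}:
        raise ValueError(f"unsupported column types: {type_a}, {type_b}")
    if kinds == {"Numerical"}:
        return ("Spearman's rho", "[-1, 1]")
    # Categorical-Categorical, or Numerical-Categorical after the numerical
    # column has been discretized into bins.
    return ("Cramer's V", "[0, 1]")

print(auto_metric("Numerical", "Numerical"))
```

Since all 9 variables in this report are numeric, every pair falls in the Numerical-Numerical case.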

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
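
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling runs; the helper names are hypothetical, and tied values are given the average of their rank positions, which is a common convention.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [a ** 3 for a in x]  # nonlinear but strictly increasing in x
print(spearman_rho(x, y))
```

Because y increases whenever x does, the ranks coincide and ρ comes out at (numerically) 1 even though the relationship is not linear.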

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
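
As a minimal sketch of this formula (with made-up sample data and a hypothetical helper name), the snippet below computes r and also checks the invariance property mentioned above by shifting and rescaling both variables with positive scale factors.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Illustrative data only, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x, y)

# Change location and (positive) scale of each variable separately.
x_shifted = [10 * a + 3 for a in x]
y_shifted = [0.5 * b - 7 for b in y]
r_transformed = pearson_r(x_shifted, y_shifted)

print(r, r_transformed)
```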

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
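
The pair-counting definition above can be sketched directly. Note the assumption: this is the simple variant often called τ-a, in which tied pairs count as neither concordant nor discordant. Library implementations commonly use the tie-adjusted τ-b instead, so values for columns with many repeated values (such as the zeros in this dataset) may differ from this sketch. The helper name and sample data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ignoring ties."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Illustrative data only, not taken from the report.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)
```

Here 8 of the 10 pairs are concordant and 2 are discordant, giving τ = (8 - 2) / 10.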

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completeness.

Sample

First rows
Timestamp | FREQUENCIA BOMBA 1 | FREQUENCIA BOMBA 2 | FREQUENCIA BOMBA 3 | NIVEL DO RESERVATÓRIO - LT01 | VAZÃO DE ENTRADA- FT01 | VAZÃO DE GRAVIDADE - FT02 | VAZÃO DE RECALQUE - FT03 | PRESSÃO DE SUCÇÃO - PT01 | PRESSÃO DE RECALQUE - PT02
2018-01-01 18:00:00 | 49.403656 | 0.0 | 0.0 | 3.746744 | 0.000000 | 108.857086 | 87.449265 | 4.809285 | 16.963104
2018-01-01 19:00:00 | 52.154751 | 0.0 | 0.0 | 3.797072 | 280.593262 | 110.514389 | 94.777748 | 4.873311 | 18.046513
2018-01-01 20:00:00 | 51.284271 | 0.0 | 0.0 | 3.917960 | 279.348541 | 109.422928 | 91.448112 | 5.001904 | 17.925348
2018-01-01 21:00:00 | 50.209923 | 0.0 | 0.0 | 4.035455 | 276.239441 | 116.948288 | 91.837364 | 5.116417 | 17.015503
2018-01-02 18:00:00 | 56.769325 | 0.0 | 0.0 | 4.082773 | 279.826141 | 128.389175 | 105.957619 | 5.077098 | 20.014168
2018-01-02 19:00:00 | 57.272129 | 0.0 | 0.0 | 4.107990 | 0.000000 | 125.107101 | 105.298088 | 5.079437 | 20.960829
2018-01-02 20:00:00 | 56.636189 | 0.0 | 0.0 | 3.729905 | 0.000000 | 127.512062 | 103.857094 | 4.704232 | 20.047829
2018-01-02 21:00:00 | 57.462830 | 0.0 | 0.0 | 3.352001 | 0.000000 | 126.796616 | 106.352890 | 4.311033 | 20.046518
2018-01-03 18:00:00 | 57.935921 | 0.0 | 0.0 | 4.262936 | 269.974426 | 131.094788 | 110.580109 | 5.218151 | 20.043404
2018-01-03 19:00:00 | 59.161697 | 0.0 | 0.0 | 4.254199 | 0.000000 | 129.144516 | 112.750771 | 5.176331 | 21.072531

Last rows
Timestamp | FREQUENCIA BOMBA 1 | FREQUENCIA BOMBA 2 | FREQUENCIA BOMBA 3 | NIVEL DO RESERVATÓRIO - LT01 | VAZÃO DE ENTRADA- FT01 | VAZÃO DE GRAVIDADE - FT02 | VAZÃO DE RECALQUE - FT03 | PRESSÃO DE SUCÇÃO - PT01 | PRESSÃO DE RECALQUE - PT02
2020-12-29 20:00:00 | 57.707756 | 45.828499 | 0.000000 | 3.810268 | 0.115741 | 140.620041 | 115.866653 | 4.688996 | 22.989004
2020-12-29 21:00:00 | 57.683941 | 45.026524 | 0.000000 | 3.402814 | 0.115741 | 132.692459 | 108.486130 | 4.342391 | 22.019676
2020-12-30 18:00:00 | 57.988792 | 46.420929 | 0.000000 | 3.667684 | 0.115741 | 149.632156 | 118.489578 | 4.525047 | 23.017941
2020-12-30 19:00:00 | 57.988792 | 47.173454 | 0.000000 | 3.235172 | 0.115741 | 149.119476 | 120.389168 | 4.071888 | 23.017941
2020-12-30 20:00:00 | 57.988792 | 47.555569 | 0.000000 | 2.805708 | 0.115741 | 140.062714 | 117.884583 | 3.659940 | 23.017941
2020-12-30 21:00:00 | 57.988792 | 46.197823 | 0.000000 | 2.487020 | 274.195374 | 132.376358 | 108.883102 | 3.382590 | 22.066475
2020-12-31 18:00:00 | 57.988792 | 45.858448 | 0.000000 | 4.122405 | 0.115741 | 154.669449 | 118.098953 | 4.976349 | 23.046875
2020-12-31 19:00:00 | 57.988792 | 46.797836 | 0.000000 | 3.669613 | 0.115741 | 149.436783 | 121.814819 | 4.499011 | 23.046875
2020-12-31 20:00:00 | 57.988792 | 47.061192 | 0.000000 | 3.221217 | 0.115741 | 151.018051 | 117.933388 | 4.070826 | 23.046875
2020-12-31 21:00:00 | 0.000000 | 45.869583 | 57.988792 | 2.803498 | 0.115741 | 129.294189 | 106.692932 | 3.714235 | 21.976273

Duplicate rows

Most frequently occurring

Index | FREQUENCIA BOMBA 1 | FREQUENCIA BOMBA 2 | FREQUENCIA BOMBA 3 | NIVEL DO RESERVATÓRIO - LT01 | VAZÃO DE ENTRADA- FT01 | VAZÃO DE GRAVIDADE - FT02 | VAZÃO DE RECALQUE - FT03 | PRESSÃO DE SUCÇÃO - PT01 | PRESSÃO DE RECALQUE - PT02 | # duplicates
0 | 0.0 | 0.0 | 0.0 | 3.920139 | 0.115741 | 0.0 | 0.028935 | 5.182292 | 0.0 | 2
1 | 0.0 | 0.0 | 0.0 | 4.295139 | 0.115741 | 0.0 | 0.028935 | 5.532407 | 0.0 | 2
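
One plausible way to arrive at a table like the one above is to count how often each complete row occurs and keep the rows seen more than once. The sketch below does this with the standard library on a tiny illustrative list of rows: the two duplicated rows reuse the values from the table, the extra row is the first sample row, and this is not necessarily how pandas-profiling computes it.

```python
from collections import Counter

# Illustrative rows only: each tuple holds the 9 variable values of one
# observation, in the column order used throughout this report.
rows = [
    (0.0, 0.0, 0.0, 3.920139, 0.115741, 0.0, 0.028935, 5.182292, 0.0),
    (0.0, 0.0, 0.0, 4.295139, 0.115741, 0.0, 0.028935, 5.532407, 0.0),
    (49.403656, 0.0, 0.0, 3.746744, 0.0, 108.857086, 87.449265, 4.809285, 16.963104),
    (0.0, 0.0, 0.0, 3.920139, 0.115741, 0.0, 0.028935, 5.182292, 0.0),
    (0.0, 0.0, 0.0, 4.295139, 0.115741, 0.0, 0.028935, 5.532407, 0.0),
]

# Count identical rows and keep those that occur more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicated.items():
    print(row, "# duplicates:", n)
```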